Abstract

In this paper, we analyse the dual pivot Quicksort, a variant of the standard
Quicksort algorithm, in which two pivots are used for the partitioning of the
array. We solve recurrences for the expected number of key comparisons
and exchanges performed by the algorithm, obtaining the exact and asymptotic
total average values contributing to its time complexity. Further, we
compute the average number of partitioning stages and the variance of the
number of key comparisons. In terms of mean values, dual pivot Quicksort
does not appear to be faster than the ordinary algorithm.


1 Introduction 

Quicksort is a sorting algorithm with an extensive literature regarding its
mathematical analysis and its applications. Without loss of generality, suppose that we
want to sort, using Quicksort, a random permutation of the distinct keys $\{1, 2, \ldots, n\}$, with all
$n!$ permutations equally likely. A key is randomly chosen as pivot and, by pairwise
comparisons of the other elements with it, smaller keys are placed to its left and greater
keys to its right. Now the pivot $j$ is at its final position and the algorithm is recursively
invoked to sort independently the left and right subarrays of $(j-1)$ and $(n-j)$
elements, respectively. Letting $C_n$ be the number of comparisons for sorting $n$
keys, its average value is given by the following recursive relation

$$E(C_n) = n - 1 + \frac{1}{n}\sum_{j=1}^{n}\bigl(E(C_{j-1}) + E(C_{n-j})\bigr) = n - 1 + \frac{2}{n}\sum_{j=1}^{n}E(C_{j-1}),$$

with initial condition $C_0 = 0$. Subtracting $(n-1)E(C_{n-1})$ from $nE(C_n)$ and
telescoping, the average number of comparisons is

$$E(C_n) = 2(n+1)H_n - 4n \sim 2n\ln(n). \tag{1.1}$$
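The closed form in Eq. (1.1) can be checked numerically against the recurrence. The following is a minimal sketch using exact rational arithmetic; the function names `harmonic` and `expected_comparisons` are illustrative choices and not part of the original analysis.

```python
from fractions import Fraction


def harmonic(n):
    """Return the n-th harmonic number H_n as an exact fraction (H_0 = 0)."""
    return sum((Fraction(1, j) for j in range(1, n + 1)), Fraction(0))


def expected_comparisons(n_max):
    """Iterate E(C_n) = n - 1 + (2/n) * sum_{j=1}^{n} E(C_{j-1}) for n = 0..n_max."""
    e = [Fraction(0)]  # E(C_0) = 0
    for n in range(1, n_max + 1):
        # At this point e holds E(C_0), ..., E(C_{n-1}).
        e.append(n - 1 + Fraction(2, n) * sum(e))
    return e


values = expected_comparisons(30)
# Compare every value with the closed form 2(n+1)H_n - 4n of Eq. (1.1).
assert all(values[n] == 2 * (n + 1) * harmonic(n) - 4 * n for n in range(31))
```

Working with `Fraction` rather than floating point means the comparison is an exact equality test, not an approximate one.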

Similarly, it is a routine matter to compute that the average number of exchanges
performed is

$$\frac{2(n+1)H_n - 3n}{6} \sim \frac{1}{3}\,n\ln(n),$$

when, at the first partitioning stage, the expected number of exchanges is

$$\frac{n}{6} + \frac{5}{6n}. \tag{1.2}$$


Recall that $H_n$ is the $n$th harmonic number, defined by $H_n := \sum_{j=1}^{n}\frac{1}{j}$ and $H_0 := 0$.
Further, the sign $\sim$ denotes asymptotic equivalence between two functions $f(n)$ and
$g(n)$. That is, $f(n) \sim g(n)$ if and only if $\lim_{n\to\infty} f(n)/g(n) = 1$. (In [2], [4], it is suggested
that small segments of size less than some parameter $m$ be sorted by a simpler
algorithm, such as insertion sort, as this is quicker in practice; in order to simplify
the calculations for the solutions of the recurrences, we assume that $m = 0$.)


2 Partitioning on two pivots 

The idea for this variant is that we can randomly select two elements as pivots
for the partitioning of the array. The number of comparisons obeys the following
recursive rule:

$$C_n = \text{``Number of comparisons during first partitioning stage''} + C_{i-1} + C_{j-i-1} + C_{n-j}.$$

The first stage includes one comparison of the two pivots with each other, after which
they are swapped if they are not in order. If elements $i < j$ are selected, the array is
partitioned into three subarrays: one with $(i-1)$ keys smaller than $i$, a subarray of
$(j-i-1)$ keys between the two pivots, and the part of $(n-j)$ elements greater than $j$.
The algorithm is then recursively applied to each of these subarrays. The number of
comparisons during the first stage is

$$A_n = 1 + \bigl((i-1) + 2(j-i-1) + 2(n-j)\bigr), \quad i = 1, \ldots, n-1 \ \text{and}\ j = i+1, \ldots, n,$$

because if an element is lower than $i$, then it is automatically less than $j$, so the $i-1$
elements below $i$ only need to be compared with one of the pivots. However, if an
element is greater than $i$, then it needs to be compared with the other pivot as well,
to determine whether or not it is greater than $j$. We refer to Sedgewick [4] for code
for a version of this scheme. The average value of $A_n$ is

$$\begin{aligned}
E(A_n) &= \frac{1}{\binom{n}{2}}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\bigl(1 + (i-1) + 2(j-i-1) + 2(n-j)\bigr) = \frac{1}{\binom{n}{2}}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(2n-i-2) \\
&= \frac{2}{n(n-1)}\sum_{i=1}^{n-1}(n-i)(2n-i-2) = \frac{2}{n(n-1)}\left(\frac{5}{6}n^3 - 2n^2 + \frac{7}{6}n\right) = \frac{5n-7}{3}.
\end{aligned}$$
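The partitioning scheme described above can be sketched in code. This is not Sedgewick's in-place routine from [4]; it is a hypothetical list-based illustration that only counts key comparisons. It assumes the first and last keys are used as pivots, which on a uniformly random permutation has the same effect as choosing the two pivots at random. The names `dual_pivot_sort` and `average_comparisons` are ours.

```python
from fractions import Fraction
from itertools import permutations


def dual_pivot_sort(keys):
    """Sort distinct keys with two pivots; return (sorted_list, comparisons)."""
    if len(keys) <= 1:
        return list(keys), 0
    p, q = keys[0], keys[-1]  # assumption: first and last keys act as pivots
    comparisons = 1  # the two pivots are compared with each other
    if p > q:
        p, q = q, p
    small, middle, large = [], [], []
    for x in keys[1:-1]:
        comparisons += 1  # every key is compared with the smaller pivot
        if x < p:
            small.append(x)
            continue
        comparisons += 1  # keys above the smaller pivot also meet the larger one
        if x < q:
            middle.append(x)
        else:
            large.append(x)
    result = []
    for part, pivot in ((small, p), (middle, q), (large, None)):
        sorted_part, c = dual_pivot_sort(part)
        comparisons += c
        result.extend(sorted_part)
        if pivot is not None:
            result.append(pivot)
    return result, comparisons


def average_comparisons(n):
    """Average the comparison count over all n! permutations of 1..n."""
    counts = [dual_pivot_sort(list(perm))[1]
              for perm in permutations(range(1, n + 1))]
    return Fraction(sum(counts), len(counts))
```

For example, `dual_pivot_sort([3, 1, 2])` uses the pivots 2 and 3, so the single remaining key needs only one comparison in addition to the comparison between the pivots.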

Hence, the recurrence for the expected number of comparisons is

$$E(C_n) = \frac{5n-7}{3} + \frac{2}{n(n-1)}\left(\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}E(C_{i-1}) + \sum_{i=1}^{n-1}\sum_{j=i+1}^{n}E(C_{j-i-1}) + \sum_{i=1}^{n-1}\sum_{j=i+1}^{n}E(C_{n-j})\right).$$

Note that the three double sums above are equal. Therefore, the recurrence becomes

$$E(C_n) = \frac{5n-7}{3} + \frac{6}{n(n-1)}\sum_{i=1}^{n-1}(n-i)E(C_{i-1}).$$


Letting $a_n = E(C_n)$, we have

$$a_n = \frac{5n-7}{3} + \frac{6}{n(n-1)}\sum_{i=1}^{n-1}(n-i)a_{i-1}, \quad n \geq 2.$$

Trivially, it holds that $a_0 = a_1 = 0$. Multiplying both sides by $\binom{n}{2}$, we obtain

$$\binom{n}{2}a_n = \binom{n}{2}\left(\frac{5n-7}{3} + \frac{6}{n(n-1)}\sum_{i=1}^{n-1}(n-i)a_{i-1}\right) = \frac{n(n-1)(5n-7)}{6} + 3\sum_{i=1}^{n-1}(n-i)a_{i-1}.$$


We introduce the difference operator $\Delta$ for the solution of this recurrence. The
operator is defined by

$$\Delta F(n) := F(n+1) - F(n) \quad\text{and for higher orders}\quad \Delta^k F(n) := \Delta^{k-1}F(n+1) - \Delta^{k-1}F(n).$$

Thus, we have

$$\begin{aligned}
\Delta\binom{n}{2}a_n &= \binom{n+1}{2}a_{n+1} - \binom{n}{2}a_n = \frac{5n^2-3n}{2} + 3\sum_{i=0}^{n-1}a_i, \\
\Delta^2\binom{n}{2}a_n &= \Delta\binom{n+1}{2}a_{n+1} - \Delta\binom{n}{2}a_n = 5n + 1 + 3a_n.
\end{aligned}$$

By definition,

$$\Delta^2\binom{n}{2}a_n = \Delta\left(\binom{n+1}{2}a_{n+1} - \binom{n}{2}a_n\right) = \binom{n+2}{2}a_{n+2} - 2\binom{n+1}{2}a_{n+1} + \binom{n}{2}a_n$$


and the recurrence becomes

$$\begin{aligned}
&(n+1)(n+2)a_{n+2} - 2n(n+1)a_{n+1} + n(n-1)a_n = 2(5n + 1 + 3a_n) \\
\Longrightarrow\ &(n+1)\bigl((n+2)a_{n+2} - (n-2)a_{n+1}\bigr) - (n+2)\bigl((n+1)a_{n+1} - (n-3)a_n\bigr) = 2(5n+1).
\end{aligned}$$

Dividing by $(n+1)(n+2)$, we obtain the telescoping recurrence

$$\frac{(n+2)a_{n+2} - (n-2)a_{n+1}}{n+2} = \frac{(n+1)a_{n+1} - (n-3)a_n}{n+1} + \frac{2(5n+1)}{(n+1)(n+2)},$$

which yields

$$\frac{(n+2)a_{n+2} - (n-2)a_{n+1}}{n+2} = 2\sum_{j=0}^{n}\frac{5j+1}{(j+1)(j+2)} = \frac{18}{n+2} + 10H_{n+1} - 18.$$
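The closed form of the sum follows from a partial fraction decomposition. The intermediate steps are not spelled out in the text; one way to carry them out is sketched here.

```latex
\begin{align*}
\frac{5j+1}{(j+1)(j+2)} &= \frac{9}{j+2} - \frac{4}{j+1}, \\
2\sum_{j=0}^{n}\frac{5j+1}{(j+1)(j+2)}
  &= 18\,(H_{n+2} - 1) - 8\,H_{n+1} \\
  &= \frac{18}{n+2} + 10\,H_{n+1} - 18,
  \qquad\text{since } H_{n+2} = H_{n+1} + \frac{1}{n+2}.
\end{align*}
```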


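The recurrence is equivalent to

$$na_n - (n-4)a_{n-1} = 18 + 10nH_{n-1} - 18n.$$

Multiplying by $\frac{(n-1)(n-2)(n-3)}{24}$, this recurrence is transformed to a telescoping
one [4],

$$\binom{n}{4}a_n = \binom{n-1}{4}a_{n-1} + \frac{18(n-1)(n-2)(n-3)}{24} + 10\binom{n}{4}H_{n-1} - 18\binom{n}{4}.$$

Unwinding, we have

$$\binom{n}{4}a_n = \frac{18}{24}\sum_{j=1}^{n}(j-1)(j-2)(j-3) + 10\sum_{j=1}^{n}\binom{j}{4}H_{j-1} - 18\sum_{j=1}^{n}\binom{j}{4}. \tag{2.1}$$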




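Using Maple, we found that

$$\sum_{j=1}^{n}(j-1)(j-2)(j-3) = 6\binom{n}{4}$$

and for the other sums in Eq. (2.1),

$$\sum_{j=1}^{n}\binom{j}{4} = \binom{n+1}{5} \quad\text{and}\quad \sum_{j=1}^{n}\binom{j}{4}H_j = \binom{n+1}{5}\left(H_{n+1} - \frac{1}{5}\right).$$

Therefore

$$\begin{aligned}
\sum_{j=1}^{n}\binom{j}{4}H_{j-1} &= \sum_{j=1}^{n}\binom{j}{4}\left(H_j - \frac{1}{j}\right) = \sum_{j=1}^{n}\binom{j}{4}H_j - \frac{1}{24}\sum_{j=1}^{n}(j-1)(j-2)(j-3) \\
&= \binom{n+1}{5}\left(H_{n+1} - \frac{1}{5}\right) - \frac{1}{4}\binom{n}{4}.
\end{aligned}$$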
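These sums were obtained with Maple; they can also be spot-checked independently. The following sketch compares each sum with its closed form using exact arithmetic. The helper name `check_sum_identities` is ours and purely illustrative.

```python
from fractions import Fraction
from math import comb


def harmonic(n):
    """Return the n-th harmonic number H_n as an exact fraction (H_0 = 0)."""
    return sum((Fraction(1, j) for j in range(1, n + 1)), Fraction(0))


def check_sum_identities(n):
    """Return True if the three sums match their closed forms for this n."""
    falling = sum((j - 1) * (j - 2) * (j - 3) for j in range(1, n + 1))
    binomials = sum(comb(j, 4) for j in range(1, n + 1))
    weighted = sum(comb(j, 4) * harmonic(j - 1) for j in range(1, n + 1))
    closed = (comb(n + 1, 5) * (harmonic(n + 1) - Fraction(1, 5))
              - Fraction(comb(n, 4), 4))
    return (falling == 6 * comb(n, 4)
            and binomials == comb(n + 1, 5)
            and weighted == closed)
```

A check over a range of values, such as `all(check_sum_identities(n) for n in range(25))`, is of course evidence rather than a proof of the identities.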


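Now Eq. (2.1) becomes

$$\begin{aligned}
\binom{n}{4}a_n &= \frac{9}{2}\binom{n}{4} + 10\left(\binom{n+1}{5}\left(H_{n+1} - \frac{1}{5}\right) - \frac{1}{4}\binom{n}{4}\right) - 18\binom{n+1}{5}, \\
a_n &= \frac{9}{2} + 10\left(\frac{n+1}{5}\left(H_{n+1} - \frac{1}{5}\right) - \frac{1}{4}\right) - \frac{18(n+1)}{5}.
\end{aligned}$$

Finally, the expected number of comparisons, when two pivots are chosen, is

$$a_n = 2(n+1)H_n - 4n \sim 2n\ln(n). \tag{2.2}$$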

This is exactly the same as the expected number of comparisons in Eq. (1.1) 
computed earlier for ordinary Quicksort. The dual pivot variant is claimed to be 
faster in experimental measurements than the standard algorithm in [6]. A referee 
of this article commented to us that the variant gives a 30% performance boost on 
randomly permuted data. 
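The equality of Eq. (2.2) and Eq. (1.1) can be confirmed by iterating the dual pivot recurrence for $a_n$ directly. This is a minimal sketch with exact fractions; the function name `dual_pivot_comparisons` is an illustrative choice.

```python
from fractions import Fraction


def harmonic(n):
    """Return the n-th harmonic number H_n as an exact fraction (H_0 = 0)."""
    return sum((Fraction(1, j) for j in range(1, n + 1)), Fraction(0))


def dual_pivot_comparisons(n_max):
    """Iterate a_n = (5n-7)/3 + 6/(n(n-1)) * sum_{i=1}^{n-1} (n-i) a_{i-1}."""
    a = [Fraction(0), Fraction(0)]  # a_0 = a_1 = 0
    for n in range(2, n_max + 1):
        tail = sum((n - i) * a[i - 1] for i in range(1, n))
        a.append(Fraction(5 * n - 7, 3) + Fraction(6, n * (n - 1)) * tail)
    return a


a = dual_pivot_comparisons(40)
# Every value agrees with the closed form 2(n+1)H_n - 4n.
assert all(a[n] == 2 * (n + 1) * harmonic(n) - 4 * n for n in range(41))
```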

We proceed to compute the average number of exchanges. Letting $S_n$ denote
the total number of exchanges we carry out when sorting $n$ objects, we have that

$$S_n = \text{``Number of exchanges during first partitioning stage''} + S_{i-1} + S_{j-i-1} + S_{n-j}.$$

It is fairly clear, again using that the pivots are chosen uniformly at
random, that the average values of the last three quantities are equal. So our main
objective now is to determine the average number of exchanges during the first
partitioning stage. At the end of the partition routine, $(i-1)$ elements are less
than pivot $i$. Thus, the contribution to the number of exchanges is [4],

$$\frac{1}{\binom{n}{2}}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(i-1) = \frac{1}{\binom{n}{2}}\sum_{i=1}^{n-1}(n-i)(i-1) = \frac{1}{\binom{n}{2}}\left(\sum_{i=1}^{n-1}i(n-i) - \sum_{i=1}^{n-1}(n-i)\right).$$

The first sum being evaluated gives

$$\sum_{i=1}^{n-1}i(n-i) = \frac{n^3-n}{6}$$

and the second is just $n(n-1)/2$. The average contribution is

$$\frac{2}{n(n-1)}\left(\frac{n^3-n}{6} - \frac{n(n-1)}{2}\right) = \frac{n-2}{3}.$$

Similarly, the average number of exchanges for the $(n-j)$ elements greater than
the second pivot is $\frac{n-2}{3}$, since the double sums are equal. Adding the two final
"exchanges" to get the pivots in place, the average number of exchanges during the
partitioning routine is $\frac{2(n+1)}{3}$. Therefore, the recurrence for the mean number
of exchanges in the course of the algorithm is

$$\begin{aligned}
E(S_n) &= \frac{2(n+1)}{3} + \frac{2}{n(n-1)}\left(\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}E(S_{i-1}) + \sum_{i=1}^{n-1}\sum_{j=i+1}^{n}E(S_{j-i-1}) + \sum_{i=1}^{n-1}\sum_{j=i+1}^{n}E(S_{n-j})\right) \\
&= \frac{2(n+1)}{3} + \frac{6}{n(n-1)}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}E(S_{i-1}) \\
&= \frac{2(n+1)}{3} + \frac{6}{n(n-1)}\sum_{i=1}^{n-1}(n-i)E(S_{i-1}).
\end{aligned}$$

This recurrence is solved in [4]: here we present a solution using generating
functions. Letting $b_n = E(S_n)$, we have

$$b_n = \frac{2(n+1)}{3} + \frac{6}{n(n-1)}\sum_{i=1}^{n-1}(n-i)b_{i-1}.$$

Multiplying by $\binom{n}{2}$ to clear fractions, we have

$$\binom{n}{2}b_n = \binom{n}{2}\left(\frac{2(n+1)}{3} + \frac{6}{n(n-1)}\sum_{i=1}^{n-1}(n-i)b_{i-1}\right) = \frac{n(n-1)(n+1)}{3} + 3\sum_{i=1}^{n-1}(n-i)b_{i-1}.$$


Multiplying by $z^n$ and summing over $n$, in order to obtain the generating function
$g(z) = \sum_{n=0}^{\infty}b_nz^n$ of the coefficients $b_n$,

$$\begin{aligned}
\sum_{n=0}^{\infty}\binom{n}{2}b_nz^n &= \frac{1}{3}\sum_{n=0}^{\infty}n(n-1)(n+1)z^n + 3\sum_{n=1}^{\infty}\sum_{i=1}^{n}(n-i)b_{i-1}z^n \\
\frac{z^2}{2}\sum_{n=0}^{\infty}n(n-1)b_nz^{n-2} &= \frac{z^2}{3}\sum_{n=0}^{\infty}n(n-1)(n+1)z^{n-2} + 3\sum_{n=1}^{\infty}\sum_{i=1}^{n}(n-i)b_{i-1}z^n \\
\frac{z^2}{2}\frac{d^2g(z)}{dz^2} &= \frac{z^2}{3}\frac{d^3}{dz^3}\left(\sum_{n=0}^{\infty}z^{n+1}\right) + 3\sum_{n=1}^{\infty}\sum_{i=1}^{n}(n-i)b_{i-1}z^n.
\end{aligned}$$

The first term on the right-hand side of the previous equation involves the third order
derivative of the following geometric series

$$\sum_{n=0}^{\infty}z^{n+1} = \frac{z}{1-z} \ \Longrightarrow\ \frac{d^3}{dz^3}\left(\frac{z}{1-z}\right) = \frac{6}{(1-z)^4}, \quad |z| < 1.$$


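The double sum is equal to

$$\begin{aligned}
\sum_{n=1}^{\infty}\sum_{i=1}^{n}(n-i)b_{i-1}z^n &= b_0z^2 + (2b_0 + b_1)z^3 + (3b_0 + 2b_1 + b_2)z^4 + \cdots \\
&= z^2(b_0 + b_1z + b_2z^2 + \cdots) + 2z^3(b_0 + b_1z + b_2z^2 + \cdots) + \cdots \\
&= (z^2 + 2z^3 + 3z^4 + \cdots)\,g(z) \\
&= \left(\sum_{n=0}^{\infty}nz^{n+1}\right)g(z).
\end{aligned}$$

The sum which multiplies $g(z)$ on the last line is

$$\sum_{n=0}^{\infty}nz^{n+1} = \left(\frac{z}{1-z}\right)^2.$$

Therefore, our recurrence is transformed to the following differential equation

$$\frac{z^2}{2}\frac{d^2g(z)}{dz^2} = \frac{2z^2}{(1-z)^4} + 3g(z)\left(\frac{z}{1-z}\right)^2.$$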


Changing variables $v = 1 - z$, we have that $f(v) = g(1-v)$. Thus, it holds that

$$\frac{d^kf(v)}{dv^k} = (-1)^k\frac{d^kg(1-v)}{d(1-v)^k}.$$

The differential equation becomes

$$\frac{(1-v)^2}{2}\frac{d^2f(v)}{dv^2} = \frac{2(1-v)^2}{v^4} + 3f(v)\left(\frac{1-v}{v}\right)^2.$$

Using Maple, the general solution is

$$f(v) = c_1v^3 + \frac{c_2}{v^2} - \frac{20\ln(v) + 4}{25v^2}, \quad c_1, c_2 \in \mathbb{R}.$$

For the computation of constants, we consider the fact that $f(1) = g(0) = 0$ and
$f'(1) = -g'(0) = 0$ (as $b_0 = b_1 = 0$). The resulting system of linear equations in
$c_1$ and $c_2$ has solution $(c_1, c_2) = \left(\frac{4}{25}, 0\right)$. Therefore, the function is

$$f(v) = \frac{4}{25}v^3 - \frac{20\ln(v) + 4}{25v^2}.$$
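The general solution quoted from Maple can also be recovered by hand. As a sketch: dividing the differential equation in $f(v)$ by $(1-v)^2/2$ gives an Euler (equidimensional) equation. The trial functions $v^r$ and $A v^{-2}\ln(v)$ are our choice of ansatz; the logarithm is needed because $v^{-2}$ already solves the homogeneous equation. The constant $\tilde c_2$ is introduced only for this illustration.

```latex
\begin{align*}
v^2 f''(v) - 6 f(v) &= \frac{4}{v^2}, \\
f = v^r:\quad r(r-1) - 6 = 0 &\;\Longrightarrow\; r \in \{3, -2\}, \\
f = A v^{-2}\ln(v):\quad v^2 f'' - 6 f = -\frac{5A}{v^2}
  &\;\Longrightarrow\; A = -\frac{4}{5}, \\
f(v) &= c_1 v^3 + \frac{\tilde c_2}{v^2} - \frac{4\ln(v)}{5 v^2},
  \qquad \tilde c_2 = c_2 - \frac{4}{25}, \\
f(1) = 0,\; f'(1) = 0 &\;\Longrightarrow\; c_1 + \tilde c_2 = 0,\;\;
  3c_1 - 2\tilde c_2 = \frac{4}{5}
  \;\Longrightarrow\; c_1 = \frac{4}{25},\; \tilde c_2 = -\frac{4}{25}.
\end{align*}
```

With $\tilde c_2 = -\frac{4}{25}$ we get $c_2 = 0$, in agreement with the constants stated above.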
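But, since

$$(1-z)^{-2} = \sum_{n=0}^{\infty}(n+1)z^n \quad\text{and}\quad -\frac{\ln(v)}{v^2} = \frac{1}{(1-z)^2}\ln\left(\frac{1}{1-z}\right),$$

this can be written as a product of the following two series:

$$\begin{aligned}
\frac{1}{(1-z)^2}\ln\left(\frac{1}{1-z}\right) &= \left(\sum_{n=1}^{\infty}nz^{n-1}\right)\left(\sum_{n=1}^{\infty}\frac{z^n}{n}\right) \\
&= (1 + 2z + 3z^2 + 4z^3 + \cdots)\left(z + \frac{z^2}{2} + \frac{z^3}{3} + \frac{z^4}{4} + \cdots\right) \\
&= z + \left(2 + \frac{1}{2}\right)z^2 + \left(3 + \frac{2}{2} + \frac{1}{3}\right)z^3 + \left(4 + \frac{3}{2} + \frac{2}{3} + \frac{1}{4}\right)z^4 + \cdots \\
&= z + (H_1 + H_2)z^2 + (H_1 + H_2 + H_3)z^3 + \cdots \\
&= \sum_{n=0}^{\infty}\bigl((n+1)H_n - n\bigr)z^n.
\end{aligned}$$

Extracting the coefficients and discarding the terms of the cubic polynomial $\frac{4}{25}(1-z)^3$,
which only contribute for $n \leq 3$, the exact mean number
of exchanges of dual pivot Quicksort is equal to

$$b_n = \frac{4}{5}\bigl((n+1)H_n - n\bigr) - \frac{4}{25}(n+1) = \frac{4}{5}(n+1)H_n - \frac{24n+4}{25} \sim \frac{4}{5}\,n\ln(n). \tag{2.3}$$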


Comparing the expected number of comparisons of this variant with the standard
algorithm, we see that they are identical. However, the mean number of exchanges
is asymptotically 2.4 times that of ordinary Quicksort.
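As a numerical cross-check of Eq. (2.3), the recurrence for the mean number of exchanges can be iterated exactly. In this sketch the function names are ours, and the comparison starts at $n = 4$ because the cubic term of the generating function still contributes for smaller $n$.

```python
from fractions import Fraction


def harmonic(n):
    """Return the n-th harmonic number H_n as an exact fraction (H_0 = 0)."""
    return sum((Fraction(1, j) for j in range(1, n + 1)), Fraction(0))


def dual_pivot_exchanges(n_max):
    """Iterate b_n = 2(n+1)/3 + 6/(n(n-1)) * sum_{i=1}^{n-1} (n-i) b_{i-1}."""
    b = [Fraction(0), Fraction(0)]  # b_0 = b_1 = 0
    for n in range(2, n_max + 1):
        tail = sum((n - i) * b[i - 1] for i in range(1, n))
        b.append(Fraction(2 * (n + 1), 3) + Fraction(6, n * (n - 1)) * tail)
    return b


def closed_form_exchanges(n):
    """Right-hand side of Eq. (2.3): (4/5)(n+1)H_n - (24n+4)/25."""
    return Fraction(4, 5) * (n + 1) * harmonic(n) - Fraction(24 * n + 4, 25)


b = dual_pivot_exchanges(40)
assert all(b[n] == closed_form_exchanges(n) for n in range(4, 41))
```

The factor 2.4 is the ratio of the leading coefficients, $\frac{4}{5}$ for the dual pivot variant against $\frac{1}{3}$ for ordinary Quicksort.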

In the lines that follow, we compute the average number of partitioning stages
$E(P_n)$ of dual pivot Quicksort. The recurrence is much simpler:

$$P_n = 1 + P_{i-1} + P_{j-i-1} + P_{n-j}.$$

Averaging over all $\binom{n}{2}$ pairs of pivots $i$ and $j$ we have

$$E(P_n) = 1 + \frac{6}{n(n-1)}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}E(P_{i-1}) = 1 + \frac{6}{n(n-1)}\sum_{i=1}^{n-1}(n-i)E(P_{i-1}),$$

since the sums are equal. Again, we use generating functions for the solution of
the recurrence. Letting $f(z) = \sum_{n=0}^{\infty}E(P_n)z^n$, our recurrence is transformed to the
following differential equation

$$\frac{z^2}{2}\frac{d^2f(z)}{dz^2} = \frac{z^2}{(1-z)^3} + 3f(z)\left(\frac{z}{1-z}\right)^2.$$

Changing variables $x = 1 - z$, we have $h(x) = f(1-x)$ and the general solution is

$$h(x) = \frac{c_2}{x^2} + x^3c_1 - \frac{1}{2x}.$$

Since $h(1) = h'(1) = 0$, the constants are $c_1 = \frac{1}{10}$ and $c_2 = \frac{2}{5}$. Consequently, the
mean number of partitioning stages is found to be equal to

$$E(P_n) = \frac{2}{5}(n+1) - \frac{1}{2}. \tag{2.4}$$


This is smaller than the expected number of stages of ordinary Quicksort, which is 
n, when there is no switch to straight insertion for the sorting of small subfiles, [4]. 
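Eq. (2.4) can be checked in the same way as the other mean values. The sketch below iterates the recurrence for $E(P_n)$ with exact fractions; the name `partitioning_stages` is illustrative, and the comparison starts at $n = 4$ since the term $c_1x^3$ of $h(x)$ still affects the coefficients for $n \leq 3$.

```python
from fractions import Fraction


def partitioning_stages(n_max):
    """Iterate E(P_n) = 1 + 6/(n(n-1)) * sum_{i=1}^{n-1} (n-i) E(P_{i-1})."""
    p = [Fraction(0), Fraction(0)]  # no partitioning stage for 0 or 1 keys
    for n in range(2, n_max + 1):
        tail = sum((n - i) * p[i - 1] for i in range(1, n))
        p.append(1 + Fraction(6, n * (n - 1)) * tail)
    return p


p = partitioning_stages(40)
# Compare with (2/5)(n+1) - 1/2 from Eq. (2.4).
assert all(p[n] == Fraction(2, 5) * (n + 1) - Fraction(1, 2) for n in range(4, 41))
```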


3 Variance 


Finally, we set up a recurrence for the computation of the variance of the number
of comparisons of dual pivot Quicksort. Recall that

$$A_n = 2n - i - 2 \quad\text{and}\quad E(A_n) = \frac{5n-7}{3}.$$

By the recurrence relation for the number of comparisons, we have

$$P(C_n = t) = \frac{1}{\binom{n}{2}}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}P(A_n + C_{i-1} + C_{j-i-1} + C_{n-j} = t),$$

noting that the resulting subarrays are independently sorted, the above is

$$\frac{1}{\binom{n}{2}}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\sum_{l,m}\Bigl(P(C_{i-1} = l)\,P(C_{j-i-1} = m)\,P(C_{n-j} = t - m - l - 2n + i + 2)\Bigr).$$


Letting $f_n(z) = \sum_{t=0}^{\infty}P(C_n = t)z^t$ be the ordinary probability generating function
for the number of comparisons needed to sort $n$ keys, we obtain

$$f_n(z) = \frac{1}{\binom{n}{2}}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}z^{2n-i-2}f_{i-1}(z)f_{j-i-1}(z)f_{n-j}(z). \tag{3.1}$$

It holds that $f_n(1) = 1$ and $f_n'(1) = 2(n+1)H_n - 4n$. Moreover, the second order
derivative of Eq. (3.1) evaluated at $z = 1$ is recursively given by

$$\begin{aligned}
f_n''(1) = \frac{2}{n(n-1)}\Biggl(&\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(2n-i-2)^2 - \sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(2n-i-2) \\
&+ 2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(2n-i-2)E(C_{i-1}) + 2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(2n-i-2)E(C_{j-i-1}) \\
&+ 2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(2n-i-2)E(C_{n-j}) + 2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}E(C_{i-1})E(C_{j-i-1}) \\
&+ 2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}E(C_{i-1})E(C_{n-j}) + 2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}E(C_{j-i-1})E(C_{n-j}) \\
&+ \sum_{i=1}^{n-1}\sum_{j=i+1}^{n}f_{i-1}''(1) + \sum_{i=1}^{n-1}\sum_{j=i+1}^{n}f_{j-i-1}''(1) + \sum_{i=1}^{n-1}\sum_{j=i+1}^{n}f_{n-j}''(1)\Biggr).
\end{aligned}$$


The reader should not be discouraged by this long expression, since many of the 
sums are equal. Specifically, the fourth and fifth turn out to be equal and by simple 
manipulation of indices, the sums of the products of expected values are equal. The 
double sum of the product of the mean number of comparisons can be simplified as 
follows: 


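$$\begin{aligned}
\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}E(C_{i-1})E(C_{n-j}) &= \sum_{i=1}^{n-1}E(C_{i-1})\left(\sum_{j=i+1}^{n}E(C_{n-j})\right) \\
&= \sum_{i=1}^{n-1}\bigl(2iH_{i-1} - 4(i-1)\bigr)\left(2\binom{n-i+1}{2}H_{n-i} + \frac{n-i-5(n-i)^2}{2}\right).
\end{aligned}$$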


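The next sum was computed using results from a paper [5], which contains
interesting identities and properties of sums involving harmonic numbers.

$$\begin{aligned}
\sum_{i=1}^{n-1}i\binom{n-i+1}{2}H_{i-1}H_{n-i} &= \sum_{i=1}^{n-1}\bigl((i-1)+1\bigr)\binom{n-i+1}{2}H_{i-1}H_{n-i} \\
&= \frac{1}{2}\Biggl(\sum_{i=1}^{n-1}(i-1)(n-i)^2H_{i-1}H_{n-i} + \sum_{i=1}^{n-1}(i-1)(n-i)H_{i-1}H_{n-i} \\
&\qquad\ + \sum_{i=1}^{n-1}(n-i)^2H_{i-1}H_{n-i} + \sum_{i=1}^{n-1}(n-i)H_{i-1}H_{n-i}\Biggr).
\end{aligned}$$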


The four sums can be evaluated using Corollary 3 in [5].
After some computations in our Maple worksheet, the recurrence is

$$\begin{aligned}
f_n''(1) ={}& 2(n+1)(n+2)\bigl(H_n^2 - H_n^{(2)}\bigr) - H_n\left(\frac{17}{3}n^2 + \frac{47}{3}n + 6\right) + \frac{209}{36}n^2 + \frac{731}{36}n + \frac{13}{6} \\
&+ \frac{6}{n(n-1)}\sum_{i=1}^{n-1}(n-i)f_{i-1}''(1),
\end{aligned}$$

where $H_n^{(2)}$ is the second order harmonic number defined by $H_n^{(2)} := \sum_{k=1}^{n}\frac{1}{k^2}$. Letting
$d_n = f_n''(1)$ and subtracting $\binom{n}{2}d_n$ from $\binom{n+1}{2}d_{n+1}$, we have

$$\begin{aligned}
\Delta\binom{n}{2}d_n ={}& 4n(n+1)(n+2)\bigl(H_n^2 - H_n^{(2)}\bigr) - \frac{nH_n}{9}(84n^2 + 198n + 42) + 3\sum_{i=1}^{n}d_{i-1} \\
&+ \frac{n}{9}(79n^2 + 231n + 14),
\end{aligned}$$

using the identity [4]

$$H_{n+1}^2 - H_{n+1}^{(2)} = H_n^2 - H_n^{(2)} + \frac{2H_n}{n+1}.$$


Further, it holds that

$$\Delta^2\binom{n}{2}d_n = 12(n+1)(n+2)\bigl(H_n^2 - H_n^{(2)}\bigr) - H_n(20n^2 + 32n - 12) + 17n^2 + 37n + 3d_n.$$

The left-hand side of the previous equation is the same as

$$\binom{n+2}{2}d_{n+2} - 2\binom{n+1}{2}d_{n+1} + \binom{n}{2}d_n$$

and our recurrence becomes

$$\begin{aligned}
&(n+1)(n+2)d_{n+2} - 2n(n+1)d_{n+1} + n(n-1)d_n \\
&\quad = 2\Bigl(12(n+1)(n+2)\bigl(H_n^2 - H_n^{(2)}\bigr) - H_n(20n^2 + 32n - 12) + 17n^2 + 37n + 3d_n\Bigr).
\end{aligned}$$


Dividing by $(n+1)(n+2)$, we obtain the telescoping recurrence

$$\begin{aligned}
\frac{(n+2)d_{n+2} - (n-2)d_{n+1}}{n+2} ={}& \frac{(n+1)d_{n+1} - (n-3)d_n}{n+1} \\
&+ 2\left(12\bigl(H_n^2 - H_n^{(2)}\bigr) - \frac{H_n(20n^2 + 32n - 12)}{(n+1)(n+2)} + \frac{17n^2 + 37n}{(n+1)(n+2)}\right)
\end{aligned}$$

with solution

$$\begin{aligned}
(n+2)d_{n+2} - (n-2)d_{n+1} ={}& (24n^2 + 100n + 104)\bigl(H_{n+1}^2 - H_{n+1}^{(2)}\bigr) \\
&- H_{n+1}(88n^2 + 292n + 224) + 122n^2 + 346n + 224,
\end{aligned}$$

which is equivalent to

$$\begin{aligned}
nd_n - (n-4)d_{n-1} ={}& (24n^2 + 4n)\bigl(H_{n-1}^2 - H_{n-1}^{(2)}\bigr) \\
&- H_{n-1}(88n^2 - 60n - 8) + 122n^2 - 142n + 20.
\end{aligned}$$

Again as before, multiplying both sides by $\frac{(n-1)(n-2)(n-3)}{24}$, the recurrence
telescopes with solution

$$f_n''(1) = 4(n+1)^2\bigl(H_{n+1}^2 - H_{n+1}^{(2)}\bigr) - 4H_{n+1}(n+1)(4n+3) + 23n^2 + 33n + 12.$$

The Maple worksheet containing the computations for the variance can be found at the web
page: http://www.essex.ac.uk/maths/staff/profile.aspx?ID=1326

Using the well known fact that

$$\mathrm{Var}(C_n) = f_n''(1) + f_n'(1) - \bigl(f_n'(1)\bigr)^2,$$

the variance of the number of key comparisons of dual pivot Quicksort is

$$7n^2 - 4(n+1)^2H_n^{(2)} - 2(n+1)H_n + 13n. \tag{3.2}$$

The asymptotic figure is

$$\left(7 - \frac{2}{3}\pi^2\right)n^2 - 2n\ln(n) + O(n). \tag{3.3}$$
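For small $n$, the variance in Eq. (3.2) can be verified directly from the probability generating functions of Eq. (3.1), without solving any recurrence for the derivatives. The sketch below represents each $f_n(z)$ as a list of exact coefficients; all function names are illustrative, and the approach is only practical for modest $n$.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of z)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for s, x in enumerate(a):
        for t, y in enumerate(b):
            out[s + t] += x * y
    return out


def comparison_pgfs(n_max):
    """Build f_0, ..., f_{n_max} from Eq. (3.1); f[n][t] = P(C_n = t)."""
    f = [[Fraction(1)], [Fraction(1)]]  # sorting 0 or 1 keys needs no comparison
    for n in range(2, n_max + 1):
        total = [Fraction(0)]
        for i in range(1, n):
            for j in range(i + 1, n + 1):
                term = poly_mul(poly_mul(f[i - 1], f[j - i - 1]), f[n - j])
                term = [Fraction(0)] * (2 * n - i - 2) + term  # times z^(2n-i-2)
                if len(term) > len(total):
                    total += [Fraction(0)] * (len(term) - len(total))
                for t, c in enumerate(term):
                    total[t] += c
        pairs = n * (n - 1) // 2  # number of pivot pairs, binom(n, 2)
        f.append([c / pairs for c in total])
    return f


def variance(pgf):
    """Variance of the distribution whose probabilities are the coefficients."""
    mean = sum(t * p for t, p in enumerate(pgf))
    return sum(t * t * p for t, p in enumerate(pgf)) - mean * mean


def variance_closed_form(n):
    """Eq. (3.2), with H_n and H_n^(2) computed exactly."""
    h1 = sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))
    h2 = sum((Fraction(1, k * k) for k in range(1, n + 1)), Fraction(0))
    return 7 * n * n - 4 * (n + 1) ** 2 * h2 - 2 * (n + 1) * h1 + 13 * n
```

For instance, with three keys the number of comparisons is 3 with probability $\frac{2}{3}$ and 2 with probability $\frac{1}{3}$, which gives a variance of $\frac{2}{9}$ from both `variance(comparison_pgfs(3)[3])` and `variance_closed_form(3)`.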

Note that the variance of dual pivot Quicksort is identical with the variance of
the ordinary algorithm; see Eq. (32) in [3]. Also, in this paper we showed that
the dual pivot Quicksort variant has the same expected number of key comparisons
as the standard algorithm and, as one might expect, the mean number of stages
is smaller than the respective figure of the ordinary algorithm. However, the
expected number of exchanges is notably large. An efficient partitioning procedure
is described in a paper written by Frazer and McKellar [1], where they present
and analyse the Samplesort algorithm. It is shown [1] that the expected number
of comparisons of Samplesort slowly approaches the information-theoretic lower
bound.